Yersinia pestis, the etiologic agent of plague, is closely related to Yersinia pseudotuberculosis evolutionarily but has a very 
different mode of infection. The RNA-binding regulatory protein, Hfq, mediates regulation by small RNAs (sRNAs) and 
is required for virulence of both Y. pestis and Y. pseudotuberculosis. Moreover, Hfq is required for growth of Y. pestis, but 
not Y. pseudotuberculosis, at 37°C. Together, these observations suggest that sRNAs play important roles in the virulence 
and survival of Y. pestis, and that regulation by sRNAs may account for some of the differences between Y. pestis and Y. 
pseudotuberculosis. We have used a deep sequencing approach to identify 31 sRNAs in Y. pestis. The majority of these 
sRNAs are not conserved outside the Yersiniae. Expression of the sRNAs was confirmed by Northern analysis and we 
developed deep sequencing approaches to map 5' and 3' ends of many sRNAs simultaneously. Expression of the majority 
of the sRNAs we identified is dependent upon Hfq. We also observed temperature-dependent effects on the expression 
of many sRNAs, and differences in expression patterns between Y. pestis and Y. pseudotuberculosis. Thus, our data suggest 
that regulation by sRNAs plays an important role in the adaptation to both flea vector and mammalian host, and that 
regulation by sRNAs may contribute to the phenotypic differences between Y. pestis and Y. pseudotuberculosis. 



Introduction 

Yersinia pestis, the etiologic agent of plague, continues to pose a 
threat to human health both naturally and as a bioweapon. Y. pes- 
tis is closely related to Yersinia pseudotuberculosis. Both species are 
human pathogens and are believed to have diverged from each other 
~1,500-20,000 y ago; 1,2 ~75% of their genes share > 97% nucleo- 
tide identity. 3 Despite their genetic similarity, the diseases caused 
by these two organisms vary greatly. Y. pestis infects both mam- 
malian and arthropod hosts and is typically transmitted to humans 
through the bite of an infected flea. In humans, infection by Y. pestis 
usually manifests itself as bubonic and pneumonic plague. 4 In con- 
trast, Y. pseudotuberculosis is an enteropathogen that causes a gastro- 
intestinal disease transmitted by the fecal-oral route. 5 

Small RNAs (sRNAs) serve as important components of many 
regulatory circuits in bacteria. sRNAs are typically non-coding 
RNA molecules of < 500 nt, transcribed from intergenic regions. 6 
The majority of sRNAs characterized to date have been shown 
to downregulate gene expression at the post-transcriptional level 
by base-pairing with target mRNAs. These pairing interactions 
result in changes in transcription attenuation, translation initia- 
tion or mRNA stability. 6 

Hfq is an RNA-binding protein that is required for the sta- 
bility and/or regulatory function of many sRNAs. 7 Hfq is also 
required for the virulence of many pathogenic bacteria, 8 including 
Y. pestis' 1 and Y. pseudotuberculosis, 10 suggesting that sRNAs are 
key regulators of virulence genes in these species. Moreover, Hfq 
is required for efficient biofilm formation and gut blockage in 
the flea, important processes for transmission to a mammalian 
host, 11 and for the growth of some Y. pestis strains, but not Y. pseu- 
dotuberculosis, at 37°C. 12 Strikingly, Hfq amino acid sequence is 
100% identical across eight sequenced strains of Y. pestis and four 
of Y. pseudotuberculosis, suggesting that regulation by sRNAs 
rather than Hfq itself contributes to the difference in viability of 
Y. pestis and Y. pseudotuberculosis hfq mutants at 37°C. 

Most studies of sRNAs have focused on Escherichia coli, for 
which > 100 sRNAs have been identified, 13-15 and deep sequenc- 
ing approaches have led to the identification of similar numbers 
in other bacterial species. 16-22 Although all bacterial species are 
expected to express numerous sRNAs, conservation of sRNAs is 
generally poor, so the specific sRNA pool differs widely between 
species. Until recently, very few sRNAs had been identified in 
any Yersinia species. A recent study utilized a deep sequencing 
approach to identify 150 putative sRNAs in Y. pseudotuberculosis. 23 
The majority of the putative sRNAs identified are conserved in Y. 
pestis, but the expression and dependence on Hfq of five sRNAs 
(seven were tested) differs between the closely related species. The 
authors also showed that deletion of specific sRNAs in Y. pseudotu- 
berculosis leads to attenuation of the pathogen in a mouse model of 
infection and that the inactivation of an sRNA in Y. pestis reduces 
virulence in a mouse model of pneumonic plague. 23 

In this work, we utilized a deep sequencing approach to iden- 
tify putative sRNAs expressed in Y. pestis. We confirmed expres- 
sion of 31 sRNAs by Northern analysis, of which only 17 match 
previously identified putative sRNAs. 23 We developed genome- 
scale 5' and 3' RACE (rapid amplification of cDNA ends) 
approaches to map the 5' and 3' ends of most of the sRNAs. We 
observed a wide variety of expression patterns that depend upon 
temperature and the presence of Hfq. All of the sRNAs we identi- 
fied are conserved in Y. pseudotuberculosis (most are 100% identi- 
cal), most are conserved in other Yersinia species, but fewer than 
half are conserved in E. coli. We observed detectable expression in 
Y. pseudotuberculosis for all but one of the sRNAs, but the temper- 
ature- and Hfq-dependent expression patterns of many sRNAs 
differed between Y. pestis and Y. pseudotuberculosis. Thus, our 
data suggest that differences in sRNA expression may contribute 
to the differences in Y. pestis and Y. pseudotuberculosis biology. 

Identification of putative sRNAs in Y. pestis. To identify 
novel Y. pestis sRNAs, we purified RNA from Y. pestis KIM6+ 
grown at 37°C and constructed a cDNA library for Illumina 
sequencing. Following sequencing and mapping of reads to the 
reference genome, we identified genomic regions with contiguous 
sequence reads that partially or fully overlap an intergenic region 
(JCVI genome annotation), with at least one position having > 
500 mapped sequence reads. Thus, we generated a list of 50 puta- 
tive sRNAs with a high level of confidence. We excluded repeti- 
tive sequence, although we noted that many sequences mapped to 
repetitive sequence partially overlapping predicted transposases. 
Fully antisense RNAs could not be identified due to the lack of 
strand information in the sequencing data. 
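
The selection rule described above (contiguous read coverage that overlaps an intergenic region, with at least one position exceeding 500 mapped reads) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names, the 0-based inclusive intervals and the definition of "contiguous" as an unbroken run of positions with non-zero depth are all assumptions made for illustration, and the exclusion of repetitive sequence is not modeled.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # 0-based, inclusive (start, end); an illustrative convention


def covered_regions(depth: List[int]) -> List[Interval]:
    """Return maximal runs of positions with at least one mapped read."""
    regions, start = [], None
    for pos, d in enumerate(depth):
        if d > 0 and start is None:
            start = pos                      # a covered run begins
        elif d == 0 and start is not None:
            regions.append((start, pos - 1))  # the run ended at the previous position
            start = None
    if start is not None:                    # run extends to the end of the sequence
        regions.append((start, len(depth) - 1))
    return regions


def overlaps(a: Interval, b: Interval) -> bool:
    """True if two inclusive intervals share at least one position."""
    return a[0] <= b[1] and b[0] <= a[1]


def candidate_srnas(depth: List[int],
                    intergenic: List[Interval],
                    min_peak: int = 500) -> List[Interval]:
    """Keep covered regions that overlap an intergenic interval (partially or
    fully) and whose peak depth exceeds min_peak (hypothetical helper)."""
    hits = []
    for region in covered_regions(depth):
        peak = max(depth[region[0]:region[1] + 1])
        if peak > min_peak and any(overlaps(region, ig) for ig in intergenic):
            hits.append(region)
    return hits


# Toy per-position read depth; the values are invented for illustration.
toy_depth = [0] * 5 + [600] * 3 + [0] * 2 + [100] * 4 + [0] + [700] * 2
toy_intergenic = [(6, 12)]
print(candidate_srnas(toy_depth, toy_intergenic))
```

In this toy input, only the first covered run passes both filters; the second is too shallow and the third does not overlap the intergenic interval.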

Validation and characterization of sRNAs by northern blot. To 
confirm the presence of the putative sRNAs, and to determine their 
expression profiles in Y. pestis and Y. pseudotuberculosis, we isolated 
RNA from both species at 28°C and 37°C in hfq+ cells, isogenic 
Δhfq mutants and Δhfq strains complemented with a multi-copy 
plasmid that encodes hfq. 12 We then performed Northern analysis 
with radiolabeled oligonucleotides designed to probe each sRNA. 
The deep sequencing data did not provide strand information so 
sRNAs were first probed on the plus strand, and any that could not 
be detected were then probed on the minus strand. This approach 
confirmed that 32 of the 50 putative sRNAs are expressed at a 
detectable level and have a size consistent with that of an sRNA (< 
500 nt). All Northern-confirmed sRNAs are listed in Table 1 and 
shown in Figure S1. Representative examples of the northern blots 
are shown in Figure 1A. Confirmed sRNAs were assigned "Ysr" 
(Yersinia sRNA) names, in accordance with previously identified 
sRNAs in Y. pseudotuberculosis. 23 One putative sRNA is in fact a 
protein-coding mRNA for the gene rmf (see below). Of the 31 con- 
firmed sRNAs, 14 have not been described previously, and of the 
remaining 17, only five have been detected by a method other than 
deep sequencing. 23 

We observed a remarkable variety of expression patterns with 
respect to temperature, dependence upon hfq and species. We 
used an unsupervised learning algorithm (k-means clustering) to 
group the sRNAs into seven clusters, based on their expression 
patterns (Fig. 1B). These clusters highlight expression patterns 
that are common to multiple sRNAs. Clusters 1 and 2 consist 
largely of sRNAs that are constitutively expressed in both species, 
regardless of temperature or the presence of hfq (Ysr155/RyfD, 
Ysr156/Ffs, Ysr161, Ysr163, Ysr177, Ysr182/6S RNA, Ysr183/SroG, 
Ysr146.2/187, Ysr151/RnpB, Ysr88/152, Ysr73/169, Ysr65/175 and 
Ysr186/CsrC). Cluster 3 consists of sRNAs that are expressed 
similarly in both species but whose expression is dependent upon 
the presence of hfq (Ysr145/157, Ysr159/CyaR, Ysr164 and 
Ysr149/181). Cluster 4 consists of sRNAs that are expressed in 
both species but whose expression is dependent upon the presence 
of hfq only in Y. pestis (Ysr151/RnpB, Ysr148/153/GlmZ, 
Ysr7/154/MicA and Ysr158; the protein-coding RNA, Ysr173/rmf, 
also fell in Cluster 4). Cluster 5 consists of RNAs that are 
expressed in both species, whose expression in Y. pestis is 
dependent upon the presence of hfq, and whose expression is 
higher at 37°C than at 28°C (Ysr167, Ysr170, Ysr171 and Ysr174). 
We also observed sRNAs whose expression increased in the absence 
of hfq in Y. pseudotuberculosis but not Y. pestis (Ysr23/160 and 
Ysr165 from Cluster 6), whose expression in both species increased 
in the absence of hfq at 28°C but decreased in the absence of hfq 
at 37°C (Ysr179/CsrB from Cluster 6) or whose expression was only 
detectable in Y. pestis (Ysr172, the sole member of Cluster 7). 
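
The grouping step above can be sketched in Python. The legend of Figure 1 states that k-means clustering was applied to expression values scaled to the condition with the highest RNA level; everything else here is an assumption for illustration, including the function names, the deterministic "first k points" initialization (a production analysis would more likely use random restarts or a library implementation) and the toy band intensities, which are invented and are not measurements from this study.

```python
from typing import List, Sequence


def normalize_to_max(row: Sequence[float]) -> List[float]:
    """Express each value as a percentage of the row maximum."""
    peak = max(row)
    return [100.0 * v / peak if peak > 0 else 0.0 for v in row]


def sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points: List[List[float]], k: int, n_iter: int = 50) -> List[int]:
    """Plain Lloyd's algorithm; returns one cluster label per point.

    Assumption: centroids are initialized from the first k points so that
    the result is deterministic.
    """
    centroids = [list(p) for p in points[:k]]
    labels = None
    for _ in range(n_iter):
        # Assignment step: nearest centroid for every profile.
        new_labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c]))
                      for p in points]
        if new_labels == labels:          # converged
            break
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Hypothetical intensities; columns could stand for, e.g., wild-type and
# hfq-deletion samples at two temperatures.
raw_profiles = [
    [10, 10, 10, 10],   # roughly constitutive
    [8, 0, 8, 0],       # lost when hfq is absent
    [5, 4.5, 5, 5],     # roughly constitutive
    [20, 2, 18, 1],     # lost when hfq is absent
]
scaled = [normalize_to_max(r) for r in raw_profiles]
cluster_labels = kmeans(scaled, k=2)
print(cluster_labels)
```

The study grouped its sRNAs into seven clusters; k=2 is used here only to keep the toy example readable.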

Interestingly, for sRNAs in Cluster 5, as well as Ysr158, 
Ysr173/rmf (Cluster 4) and Ysr165 (Cluster 6), deletion of hfq 
in Y. pseudotuberculosis had no substantial effect on sRNA levels 
whereas expression of hfq from a multi-copy plasmid in the Δhfq 
strain resulted in a substantial decrease in sRNA levels relative to 
hfq+ (Fig. 1B). This result was observed for independent biologi- 
cal replicates and may be due to aberrant effects of Hfq overex- 
pression. Consistent with this, probing the same membranes with 
a radiolabeled oligonucleotide specific to hfq mRNA revealed that 
hfq is grossly overexpressed in plasmid-complemented Y. pseudo- 
tuberculosis but not Y. pestis (Fig. 1A; Fig. S2). The hfq northern 
blot data also indicated that hfq transcript levels are substantially 
lower at 37°C than at 28°C in both Y. pestis and Y. pseudotu- 
berculosis, suggesting that varying Hfq levels may contribute to 
temperature-dependent changes in expression of some sRNAs. 

Effects of hfq and temperature on sRNA levels in other 
Y. pestis and Y. pseudotuberculosis strains. To determine whether 
the different effects of hfq and temperature on sRNA expression 
between Y. pestis and Y. pseudotuberculosis are species-specific rather 
than strain-specific, we measured expression of three sRNAs, 
Ysr170, Ysr172 and Ysr179/CsrB, in another Y. pestis strain (CO92) 
and three other Y. pseudotuberculosis strains (PTB51c, PTB57c and 
PTB54c; Fig. 1C). The effect of temperature on sRNA expression 
was consistent across all strains. Specifically, in both Y. pestis and 
Y. pseudotuberculosis, expression of Ysr170 is higher at 37°C than 
at 28°C, whereas expression of Ysr179 is higher at 28°C than at 
37°C (expression of Ysr172 is unaffected by temperature). There 
are species-specific differences in expression patterns for two of the 
sRNAs that were observed for all strains tested: the temperature 
dependence of Ysr179 expression is greater in Y. pseudotuberculosis 
than in Y. pestis (Fig. 1A and C), and Ysr172 is only expressed 
in Y. pestis. In contrast, the effect of the Δhfq mutation in Y. pes- 
tis was not completely consistent between the KIM and CO92 
strains. Specifically, Ysr170 expression is less dependent upon hfq 
in Y. pestis CO92 than in KIM, Ysr172 expression is not depen- 
dent upon hfq in Y. pestis CO92 (fully dependent in KIM), and 
hfq suppresses Ysr179 expression at 28°C in Y. pestis KIM but not 
CO92. Thus, some of the differences in sRNA expression between 
Table 1. List of validated sRNAs 

sRNA name a       Start coordinate b   End coordinate b             Gene context c
Ysr151/RnpB       125535               125684                       y0115 > / < / < y0117
Ysr88/152         143205               143274                       y0129 (gltD) > / > / y0130 >
Ysr148/153/GlmZ   421034               421106, 421107               < y0376 / > / y0377 (hemY) >
Ysr7/154/MicA     999263               999189                       y0887 (gshA) > / < / y0888 >
Ysr155/RyfD       1021180              1021317                      y0913 > / > / y0914 (clpB) >
Ysr156/Ffs        1179191              1179276                      < y1045 / > / < y1046 (hha)
Ysr145/157        1229994              1230059, 1230060             < y1090 / > / < y1091
Ysr158            1341830              1341894                      y1196 (ubiF) > / > / < yt030 (tRNA-Gln)
Ysr159/CyaR       1527324              1527218                      < y1378 / < / < y1379
Ysr23/160         1593088, 1593089     1593141, 1593147, 1593149    y1436 > / > / < y1437
Ysr161            1624496              1624402, 1624433             < y1465 (nanT) / < / y1466 (cysP) >
Ysr163            1902373              1902412                      < y1720 / > / y1722 >
Ysr164            1916071, 1916098     1915948                      y1731 > / < / y1732 (ypeR) >
Ysr165            1961869              1961985                      y1782 > / > / y1783 >
Ysr11/166/FnrS    2195768              2195887                      y1994 (zntB) > / > / < y1995
Ysr167            2363578              2363674                      < y2138 / > / y2139 (rstA) >
Ysr73/169         2457081              2457002                      < y2228 / < / < y2229
Ysr170            2550744              2550868, 2550869             y2316 > / > / y2317 >
Ysr171            2815750              2815620                      y2553 (manZ) > / < / y2554 >
Ysr172            2841229              2841287, 2841312             < y2579 / > / < y2580
Ysr174            3097172              3097124                      < y2796 (serS) / < / < y2797
Ysr65/175         3111993              3111905                      < y2808 / < / < y2809 (clpA)
Ysr177            3330179              3330055                      < y3023 (moaA) / < / y3024 >
Ysr179/CsrB       3457464              3457752                      < y3145 / < / < y3146 (syd)
Ysr45/180/GcvB    3467030              3467076                      < y3154 (gcvA) / > / y3155 >
Ysr149/181        3501524, 3501536     3501452, 3501453             < y3181 (aas) / < / < y3182
Ysr182/6S RNA     3632001, 3632002     3632114                      y3299 > / > / y3300 >
Ysr183/SroG       3904566              3904637                      < y3520 (ribB) / < / y3522 >
Ysr185/Spot 42    4230754              4231124                      < y3808 / < / y3809 (engB) >
Ysr186/CsrC       4232923              4233049                      < y3810 / > / < y3811 (polA)
Ysr146.2/187      4314248              4314315                      < y3871 / > / y3872 (ompR) >

a For sRNAs identified by Koo et al., the existing name is indicated first, followed by our systematic name. Names of E. coli homologs are also indicated 
where appropriate, e.g., Ysr148/153/GlmZ was identified by Koo et al. as Ysr148, our systematic name is Ysr153, and it is homologous to E. coli GlmZ. 
Underlined names indicate sRNAs that have not been identified previously in Yersinia species. b Bold text indicates ends identified using Deep RACE 
techniques. Other end coordinates are estimates derived manually from the RNA-seq data. c Underlined genes are overlapped by the corresponding 
sRNA. 



Y. pestis and Y. pseudotuberculosis are likely due to strain-specific 
effects. Consistent with this, our previous study revealed differ- 
ences in the growth dependence on hfq between Y. pestis KIM and 
Y. pestis CO92. 12 Nevertheless, there are clear differences in sRNA 
expression patterns between Y. pestis and Y. pseudotuberculosis that 
are conserved in all strains tested. 

We have previously shown that hfq is important for growth 
of Y. pestis but not Y. pseudotuberculosis, when cells are cultured 
at 37°C but not at 28°C. 12 Hence, any sRNAs that show expres- 
sion differences between 28°C and 37°C, between wild-type and Δhfq 
cells or between Y. pestis and Y. pseudotuberculosis, are poten- 
tially associated with the unique biology of Y. pestis. As described 
above, we observed many temperature-dependent differences in 
sRNA expression. In some cases, the effect of deleting hfq was 
specific to one temperature, e.g., Ysr179/CsrB (Fig. 1), indicating 
complex interactions between temperature and hfq dependence. 
Differences in expression of sRNAs in Y. pestis between 28°C and 37°C 
could contribute to differences in gene expression for bacteria in 
a flea vector and bacteria in a mammalian host. Furthermore, 
differences in sRNA expression patterns between Y. pestis and 
Y. pseudotuberculosis could contribute to the physiological differ- 
ences between these species. This is particularly likely for Ysr172, 
which is undetectable in Y. pseudotuberculosis (Fig. 1A and C). 

Mapping of sRNA ends using Deep RACE. We developed 
genome-scale 5' and 3' RACE approaches to precisely determine 
the ends of the sRNAs confirmed by Northern analysis (Fig. 2A 

Figure 1. (A) Verification of Ysr expression by northern blot analysis. 
All northern blots are shown in Figure S1 and duplicate northern 
blots for most sRNAs are shown in Figure S4. A northern blot for 
hfq mRNA from a corresponding set of RNA samples is also shown 
(note that this blot is reproduced as part of Fig. S2). Corresponding 
5S rRNA northern blots for four of the same membranes are shown 
in Figure S6. (B) Clusters of sRNA based on k-means clustering of 
sRNA expression data. Rows correspond to individual RNAs, while 
columns correspond to conditions for which expression was mea- 
sured by northern blot. Shading indicates the relative expression of 
sRNAs for each strain/condition. Expression numbers are indicated 
as a percentage of the level for the condition in which the RNA level is 
highest. (C) Northern analysis of Ysr170, Ysr172 and Ysr179/CsrB in 
additional strains of Y. pestis and Y. pseudotuberculosis. Duplicate 
northern blots are shown in Figure S5. Note that Y. pseudotubercu- 
losis PTB52c is the same strain as used in (A). 



and B) because our deep sequencing data do not allow for 
such precise mapping. These methods combine conventional 
RACE with deep sequencing using the Ion Torrent platform 
(any deep sequencing platform would suffice). In addition 
to allowing for simultaneous analysis of many RNAs, these 
methods produce multiple sequence reads for each indi- 
vidual sRNA (e.g., 1,945 sequence reads for Ysr23/160). 
This allows us to identify multiple 5' ends and to accurately 
determine the relative abundance of each. We propose that 
these methods be named "Deep 5' RACE" and "Deep 3' 
RACE." Using these methods, we successfully mapped the 
5' ends of 18 sRNAs (and rmf mRNA) and the 3' ends of 
28 sRNAs (and rmf mRNA). The major 5' and 3' ends for 
these sRNAs are listed in Table 1 and raw data are provided 
in Tables S1 and S2. Four representative examples are shown 
in Figure 2C-F. For the sRNAs for which we mapped both 
unique 5' and 3' ends, the median length is 84 nt. In most 
cases we detected unique ends, but some RNAs have mul- 
tiple 5' ends, e.g., Ysr149/181 (Fig. 2E; Table 1). In addition, 
many sRNAs have multiple 3' ends clustered around a sin- 
gle location, e.g., Ysr148/153/GlmZ (Fig. 2C), Ysr149/181 
(Fig. 2E), Ysr7/154/MicA (Fig. 2F; Table 1). Most sRNAs 
are located entirely within intergenic regions but some 
overlap the ends of adjacent genes, e.g., Ysr165 (Fig. 2D; 
Table 1). Our Deep 5' RACE method is very similar to a 
previously described method, "Deep-RACE." 24 To the best 
of our knowledge, no method equivalent to Deep 3' RACE 
has been described previously. Given the increasing avail- 
ability of deep sequencing, we anticipate that these methods 
will become widespread for the large-scale identification of 
RNA 5' and 3' ends. 
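
Because Deep RACE yields many reads per RNA, the relative abundance of each mapped end reduces to counting. The Python sketch below illustrates that idea only; it is not the authors' analysis code. The function name, the 10% reporting threshold and the toy end positions are assumptions introduced for illustration.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def end_abundance(end_positions: Iterable[int],
                  min_fraction: float = 0.1) -> List[Tuple[int, float]]:
    """Return (position, fraction of reads) for each RNA end supported by at
    least min_fraction of the reads, most abundant first.

    end_positions holds one genomic coordinate per sequence read, i.e., the
    position at which that read places the RNA end.
    """
    counts = Counter(end_positions)
    total = sum(counts.values())
    if total == 0:
        return []
    return [(pos, n / total)
            for pos, n in counts.most_common()
            if n / total >= min_fraction]


# Invented example: 100 reads supporting three candidate 5' ends.
toy_ends = [100] * 60 + [112] * 35 + [106] * 5
print(end_abundance(toy_ends))
```

With a threshold like this, a major end and a substantial alternative end are both reported, whereas positions supported by only a few reads are treated as noise; the appropriate cut-off would depend on read depth and is not specified in the text.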

Y. pestis sRNAs fall into multiple classes based on 
overlap with annotated genes. Mapping of 5' and 3' ends 
revealed multiple classes of sRNA. The major class is inter- 
genic, i.e., no overlap with annotated genes. Based on our 
knowledge of equivalent sRNAs in other bacterial species, 
we anticipate that the majority of intergenic sRNAs func- 
tion as regulators by base-pairing with distally encoded 
mRNAs. In contrast to the intergenic sRNAs, seven of the 
sRNAs overlap an annotated protein-coding gene. In some 
cases, this may be an artifact of incorrect gene annotation. 



400 



RNA Biology 



Volume 10 Issue 3 



However, several of the overlapped genes have well- 
described functions. Two of the sRNAs, Ysr167 and 
Ysr171, overlap annotated genes in the antisense ori- 
entation. These sRNAs may be responsible for regu- 
lation of the overlapping gene, as has been observed 
previously for antisense RNAs in other species. 25 
Four of the sRNAs, Ysr88/152, Ysr155/RyfD, Ysr161 
and Ysr165, overlap the 5' end of an annotated gene, 
in the sense orientation. These sRNAs may include 
riboswitches which can generate short RNAs at the 
start of genes by promoting transcription attenuation 
or RNA processing. One sRNA, Ysr73/169, overlaps 
the 3' end of a gene in the sense orientation. This, 
and other sRNAs that overlap annotated genes in the 
sense orientation, may be processed fragments of the 
mRNAs that they overlap. 

We determined whether any of the sRNAs might 
be protein-coding. Specifically, we translated sRNAs 
in silico and searched for open reading frames of > 
30 amino acids that have significant sequence identity 
with proteins annotated in other species. We identi- 
fied one RNA, initially named Ysr173, which encodes 
a homolog of E. coli Rmf ribosome modulation fac- 
tor. 26 We note that rmf is not annotated for Y. pestis but 
is annotated for Y. pseudotuberculosis. Other sRNAs 
might also be protein-coding, as has been observed 
for some sRNAs in E. coli. 27 This is particularly true 
for Y. pestis, which has a less well-annotated genome 
as compared with E. coli. Indeed, we found many dif- 
ferences between the gene annotations available for 
Y. pestis KIM from different databases. Strikingly, rmf 
mRNA levels in Y. pestis (but not Y. pseudotuberculo- 
sis) are dependent upon the presence of hfq (Fig. 1B; 
Fig. S1), indicating that the effects of Hfq specific 
to Y. pestis are not limited to non-coding RNAs. rmf 
abundance may be controlled by direct association of 
Hfq. Alternatively, rmf may be regulated by an sRNA 
in an Hfq-dependent manner. 
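
The first half of the coding-potential check, in silico translation and a search for open reading frames of > 30 amino acids, can be sketched in Python as follows. This is an illustrative sketch rather than the procedure used in the study: it assumes the standard genetic code, scans only the three frames of the given strand, accepts only ATG as a start codon (bacterial alternative starts such as GTG or TTG are ignored) and discards ORFs that lack a stop codon. The function name is hypothetical, and the subsequent homology search against annotated proteins is not shown.

```python
from itertools import product
from typing import List, Tuple

# Standard genetic code, built in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def find_orfs(seq: str, min_aa: int = 30) -> List[Tuple[int, int, str]]:
    """Return (start, end, protein) for ORFs longer than min_aa residues.

    start is the 0-based index of the ATG; end is the index just past the
    stop codon. Only the given strand is scanned.
    """
    seq = seq.upper().replace("U", "T")
    orfs = []
    for frame in range(3):
        start, protein = None, []
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            aa = CODON_TABLE.get(codon, "X")   # X marks ambiguous codons
            if start is None:
                if codon == "ATG":             # open a new ORF
                    start, protein = i, ["M"]
            elif aa == "*":                    # stop codon closes the ORF
                if len(protein) > min_aa:
                    orfs.append((start, i + 3, "".join(protein)))
                start, protein = None, []
            else:
                protein.append(aa)
    return orfs


# Synthetic sequences (not real sRNAs): a 32-residue ORF and a short one.
long_orf_seq = "GG" + "ATG" + "GCT" * 31 + "TAA" + "CC"
short_orf_seq = "ATG" + "GCT" * 10 + "TAA"
print(find_orfs(long_orf_seq))
print(find_orfs(short_orf_seq))
```

Candidates returned by such a scan would then be compared with proteins annotated in other species, which is the step that identified Rmf in the text above.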

Sequence conservation of sRNAs between 
Y. pestis, Y. pseudotuberculosis, Y. enterocolitica 
and E. coli. We used BLAST to search for sequence 
conservation between each of the sRNAs and the 
genomes of Y. pseudotuberculosis, Y. enterocolitica and 
E. coli. Specifically, we searched using the sequence beginning 100 
bp upstream and ending 100 bp downstream of the coordinates 
identified in the initial deep sequencing experiment. A summary 
of this analysis is shown in Table 2. For sequences with matches 
in any of these species, we performed alignments using ClustalW 
(Fig. S3). In all cases, sRNA sequences in Y. pestis and Y. pseu- 
dotuberculosis were extremely similar, as expected due to the high 
sequence identity between these species. In most cases, nucleotide 
identity was > 99%, including for Ysr172, which is not detect- 
ably expressed in Y. pseudotuberculosis (Fig. 1). All but two of the 
sRNAs are conserved in Y. enterocolitica, suggesting that they rep- 
resent a core set of sRNAs for this genus. Only 14 of the sRNAs are 
conserved in E. coli. Given that most of the E. coli homologs have 
Figure 2. Representative examples of 5' and 3' Deep RACE data. (A) Schematic for the 
Deep 5' RACE method. (B) Schematic for the Deep 3' RACE method. (C-F) Deep 5' RACE 
data (blue) and Deep 3' RACE data (green) for four selected sRNAs. Flanking anno- 
tated genes are indicated by gray boxes, which are positioned above (plus) or below 
(minus) the horizontal line to denote strand. 



been characterized, conservation of sRNAs between E. coli and Y. 
pestis provides insight into the function of these sRNAs in Y. pes- 
tis. Several were only partially conserved, suggesting that their 
functions have diverged between the two species. Three Y. pestis 
sRNAs that did not generate a BLAST match in E. coli are located 
in the same gene context (i.e., same synteny with flanking genes) 
as known E. coli sRNAs, e.g., Ysr185/Spot 42. We propose that 
these sRNAs are shared between the two species but have diverged 
extensively with respect to their mRNA targets. As described pre- 
viously for MicF (one of the 18 putative sRNAs that failed the 
northern blot analysis), some sRNAs conserved between Y. pestis 
and E. coli are conserved only over short stretches of sequence that 
are known to be required for base-pairing with targets in E. coli. 28 
Table 2. Summary of alignments of nucleotide sequences of identified RNAs 

                  Y. pseudotuberculosis b      Y. enterocolitica            E. coli
sRNA a            % identity c  % coverage d   % identity c  % coverage d   % identity c  % coverage d
Ysr151/RnpB       99            100            92            100            89            85
Ysr88/152         100           100            83            64             N/A           N/A
Ysr148/153/GlmZ   100           79             72            61             N/A           N/A
Ysr7/154/MicA     99            100            86            99             80            33
Ysr155/RyfD       100           100            91            100            79            93
Ysr156/Ffs        99            100            92            72             84            40
Ysr145/157        99            100            79            100            N/A           N/A
Ysr158            99            100            69            40             N/A           N/A
Ysr159/CyaR       99            100            83            73             N/A           N/A
Ysr23/160         98            100            73            52             N/A           N/A
Ysr161            93            100            82            55             N/A           N/A
Ysr163            82            89             N/A           N/A            N/A           N/A
Ysr164            99            96             78            38             N/A           N/A
Ysr165            99            100            90            52             80            30
Ysr11/166/FnrS    100           100            87            57             83            41
Ysr167            99            100            79            82             N/A           N/A
Ysr73/169         99            100            76            100            N/A           N/A
Ysr170            100           100            N/A           N/A            N/A           N/A
Ysr171            100           100            84            69             79            42
Ysr172            100           78             87            22             83            25
Ysr173/rmf        100           76             89            75             74            72
Ysr174            100           100            82            54             81            25
Ysr65/175         100           66             93            36             N/A           N/A
Ysr177            99            100            79            100            N/A           N/A
Ysr179/CsrB       100           85             87            85             N/A           N/A
Ysr45/180/GcvB    100           100            94            75             80            75
Ysr149/181        100           100            91            68             N/A           N/A
Ysr182/6S RNA     99            100            89            79             78            78
Ysr183/SroG       99            100            92            58             70            52
Ysr185/Spot42     96            100            76            98             N/A           N/A
Ysr186/CsrC       99            83             89            82             87            40
Ysr146.2/187      99            100            90            68             N/A           N/A

a Underlined RNAs were predicted in a bioinformatic study. 15 b The Y. pseudotuberculosis 32953 strain was used as a reference genome since no reference 
genome sequence exists for PTB52c. c % identity of sRNA between Y. pestis and representative organism. d % coverage of BLAST search of sRNA 100 bp 
upstream and downstream. 



We propose that these sRNAs share a "core" set of mRNA targets 
across the Enterobacteriaceae that rely on the conserved sequence 
for base-pairing, but also have species-specific mRNA targets. In 
a few cases, e.g., Ysr165 and Ysr172, sequence identity with E. coli was 
found only for the sequence flanking the sRNA ends. Thus, the 
sequence similarity is unlikely to reflect functional conservation 
of the sRNA. 
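
The BLAST queries in this section used each sRNA's coordinates extended by 100 bp on either side, which is why matches confined to the flanking sequence can arise. A minimal Python sketch of building such a query window is given below. It is an illustration under stated assumptions, not the authors' code: coordinates are taken to be 1-based and inclusive, a start greater than the end is taken to indicate a minus-strand RNA (as in several rows of Table 1), the window is clipped at the sequence ends rather than wrapped around a circular chromosome, and the function names are hypothetical.

```python
from typing import Optional, Tuple


def blast_query_window(start: int, end: int, flank: int = 100,
                       genome_length: Optional[int] = None) -> Tuple[int, int]:
    """Return the 1-based inclusive (low, high) window covering the RNA plus
    `flank` bp on each side, regardless of which strand the RNA is on."""
    left, right = min(start, end), max(start, end)
    low = max(1, left - flank)                 # clip at the first base
    high = right + flank
    if genome_length is not None:
        high = min(high, genome_length)        # clip at the last base
    return low, high


def extract_query(genome: str, start: int, end: int, flank: int = 100) -> str:
    """Slice the padded window out of a genome string (plus-strand sequence)."""
    low, high = blast_query_window(start, end, flank, len(genome))
    return genome[low - 1:high]


# Ysr7/154/MicA coordinates from Table 1 (start > end).
print(blast_query_window(999263, 999189))

# Toy genome, for demonstration only.
toy_genome = "ACGT" * 100
print(len(extract_query(toy_genome, 150, 160, flank=5)))
```

Reporting coverage relative to this padded query, as in Table 2, means that a hit restricted to one flank can still produce a non-zero coverage value even when the sRNA itself is not conserved.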

Given that almost all of the sRNAs and their flanking sequence 
are > 95% identical between Y. pestis and Y. pseudotuberculosis, 
what accounts for the differences in expression patterns? One 
possibility is that Hfq functions differently in the two species. 
Hfq has been implicated previously in promoting stability of 
many sRNAs in other bacterial species. 7 However, Hfq is 100% 
identical at the amino acid sequence level between Y. pestis and 
Y. pseudotuberculosis, expression of hfq mRNA is similar between 
the two species (Fig. 1A) and the majority of sRNAs are 100% 
identical between the two species. Hence, it is unlikely that dif- 
ferential binding of Hfq alone contributes to the differences in 
expression patterns. Transcription of the sRNAs may be regu- 
lated differently between the two species, although any such dif- 
ferences would be due to trans-acting factors since the sequences 
upstream of the sRNAs are also well-conserved between the two 
species. Our preferred model is that mRNA target availability 
determines the stability of the sRNAs. Thus, differences in the 
abundance of mRNA targets for each sRNA between Y. pestis 
and Y. pseudotuberculosis would alter the dynamics of base-pair- 
ing, in turn altering the susceptibility of sRNAs to degradation. 
Consistent with this hypothesis, pairing of sRNAs to their target 
mRNAs has been shown previously to promote sRNA degrada- 
tion. 29 Differences in mRNA target abundance could also impact 
the size of the available pool of Hfq, which may be limiting. 30 

A previous study of sRNAs in E. coli identified likely homologs 
in Y. pestis based on sequence conservation. 15 Eight of these pre- 
dictions are consistent with the sRNAs we identified (Table 2). 
However, we did not detect RNA-seq signal for several predicted 
sRNAs. This is likely due to the fact that the equivalent E. coli 
sRNAs are expressed under stress conditions, e.g., RprA expres- 
sion is induced during stationary phase. 31 Therefore, we propose 
that many sRNAs were not detected in our study due to condi- 
tion-specific expression patterns. 

Comparison to another sRNA study in a Yersinia species. A 
recent study used deep sequencing to identify 150 putative sRNAs 
in Y. pseudotuberculosis. 23 Surprisingly, only 17 of the 31 confirmed 
sRNAs that we identified are shared with the list of putative 
sRNAs identified by Koo et al. Thus, there are substantial dispari- 
ties between the two studies. We propose the following explana- 
tions for these disparities: (1) six of the putative sRNAs identified 
by Koo et al. are not conserved in Y. pestis; (2) some of the putative 
sRNAs identified by Koo et al. may have escaped detection in our 
study by virtue of generating an insufficient number of sequence 
reads in the initial deep sequencing analysis. This is likely to be 
the case for tmRNA, for which we detect > 100 sequence reads 
but fewer than 500, the cut-off we used. However, for many of the 
sRNAs identified by Koo et al., we detect few or no sequence reads 
in the corresponding location in Y. pestis; (3) in a few cases, putative 
sRNAs identified by Koo et al. were specific to a particular growth 
phase that differs from the conditions in our study. These sRNAs 
would likely be missed by our approach; (4) only 29 of the putative 
150 sRNAs identified by Koo et al. were successfully validated by 
Northern analysis and/or RACE. Hence, it is possible that many 
of the remaining 121 candidates would be below our detection 
threshold by Northern analysis or are false positives. Consistent 
with this, 20 of the 49 putative sRNAs tested were undetectable by 
northern analysis. 23 Furthermore, the 49 putative sRNAs selected 
for northern analysis by Koo et al. have considerably higher expres- 
sion levels, based on the deep sequencing data, than those that 
were not tested (median number of sequence counts ~7-fold higher 
for those tested by northern analysis). This greatly increases the 
likelihood that the untested putative sRNAs would be below the 
detection threshold of northern analysis or are false positives; (5) 
it is possible that there is a much larger pool of sRNAs and the 
two studies have each identified a subset of that pool. Consistent 
with this, deep sequencing has been used to identify > 500 putative 
sRNAs in Vibrio cholerae 16 and > 300 putative sRNAs in E. coli; 14 
(6) Koo et al. identified sRNAs in Y. pseudotuberculosis whereas 
we identified sRNAs in Y. pestis. Although the DNA sequences 
corresponding to almost all these putative sRNAs are highly con- 
served between the two species, we have shown that sRNA expres- 
sion patterns vary considerably between the two species. Hence, it 
is likely that many of the putative sRNAs identified by Koo et al. 
are well expressed in Y. pseudotuberculosis but would be below the 
limit of detection, either by deep sequencing or Northern analysis, 
in Y. pestis. 

Our northern blot data confirm the existence in Y. pestis of 
13 of the 101 putative sRNAs identified by Koo et al. but not 
tested by northern blot in that study. Furthermore, our data pro- 
vide the first characterization of the expression of these 13 sRNAs 
in Y. pestis and Y. pseudotuberculosis. The 50 putative sRNAs we 
identified in Y. pestis by deep sequencing include relatively few 
false positives, as evidenced by the high success rate when testing 
by northern blot. This is likely due to the stringent cut-off used 
for assignment as a putative sRNA. In contrast, the 150 putative 
sRNAs identified by Koo et al. likely include many false posi- 
tives. It is also likely that the list of putative sRNAs identified by 
Koo et al. has many fewer false negatives than our list. By careful 
comparison of the data sets from both studies, it may be possible 
to prioritize additional putative sRNAs for validation. 

Materials and Methods 

Strains and growth conditions. Strains of Y. pestis used in this study 
were KIM6+ (Pgm + pCDE pMTl* pPst + ), 32 KIM6+ Δhfq::cat, 12 
KIM6+ Δhfq-multi-copy complementation::kan (Δhfq-C), 12 
CO92 33 and CO92 Δhfq::cat. 12 Strains of Y. pseudotuberculosis 
used were PTB52c WT (pYV; serotype IB; YP-HPE), 34 PTB52c 
Δhfq::cat, 12 PTB52c Δhfq-multi-copy complementation::kan 
(Δhfq-C), 12 PTB51c (pYV; serotype IB; YP-HPE), 34 PTB57c (pYV; 
serotype III; YP-HPI) and PTB54c (pYV; serotype III; YP-HPI). 34 
Construction of mutant strains and growth conditions used in this 
study have been described previously. 12 All strains of Y. pestis and Y. 
pseudotuberculosis were grown in brain heart infusion (BHI) media. 
To obtain cells for RNA isolation, overnight culture was diluted 1:5 
into 5 ml of BHI and cells were grown for 4 h at 28°C or 37°C. 
Cells were then harvested at 4°C and stored at -80°C. 

RNA isolation and purification. Cells were resuspended in 
1 ml TRIzol (Invitrogen), incubated at room temperature for 
5 min and centrifuged at 12,000 × g for 10 min at 4°C (all subsequent 
centrifugations were performed at this temperature). The 
supernatant was transferred to a new microfuge tube and 200 µl 
chloroform:isoamyl alcohol (24:1 ratio) was added. The sample 
was shaken vigorously for 15 sec, incubated at room temperature 
for 3 min and centrifuged at 12,000 × g for 15 min. The aqueous 
phase was transferred to a new microfuge tube, 500 µl isopropanol 
was added and the sample was incubated for 10 min at room 
temperature. Following centrifugation at 12,000 × g for 10 min, the 
supernatant was decanted and the pellet was washed with 1 ml 
ice-cold 75% ethanol and centrifuged at 7,600 × g for 5 min. The 
supernatant was decanted, and the residual supernatant was 
removed by pipette. The RNA pellet was air-dried, then 
resuspended in 30 µl of RNase-free water. 

Resulting RNA was treated with DNase I (New England Biolabs) 
to remove any remaining DNA. A total of 10 µl DNase I was used 
in a final volume of 500 µl and incubated for 1 h at 37°C, 
at which point 600 µl of isopropanol was added and the RNA was 
precipitated overnight at -80°C. Following precipitation, the RNA 
was centrifuged at 12,000 × g for 20 min. The supernatant was 
decanted and the pellet was washed with 1 ml ice-cold 75% ethanol 
and centrifuged at 7,600 × g for 5 min. The supernatant was 
decanted, and the residual supernatant was removed by pipette. 
The RNA pellet was air-dried, then resuspended in 30 µl of 
RNase-free water. RNA was used for initial deep sequencing 
screening and replicate RNA samples were pooled, quantitated and 
aliquoted into microfuge tubes for use in Northern analysis. 

Initial deep sequencing and characterization of sRNAs. 
Isolated RNA from Y. pestis KIM6+ grown at 37°C was sepa- 
rated on a 6% denaturing polyacrylamide gel and RNA below 
400 nt was cut from the gel. The resulting RNA was electro-eluted 
using dialysis tubing following washing with 1 ml of 0.1× 
TBE. Electro-elution was run at 100 V for 30 min. The resulting 
TBE was ethanol precipitated and the resulting RNA was exam- 
ined on an agarose gel to ensure there was no residual rRNA. A 
cDNA library was constructed using the Illumina RNA-seq kit, 
following the manufacturer's instructions except that DNA was 
gel-purified from 200 bp and above rather than 300 bp before the 
PCR amplification step. This modification increased the likeli- 
hood of identifying sRNAs. 

The DNA library was sequenced using an Illumina Genome 
Analyzer II (Harvard Medical School). Reads were mapped to 
the Y. pestis KIM genome using Bowtie with default settings. 35 
Sequences were piled up to determine the number of sequence 
reads that mapped to each nucleotide of the genome. Putative 
sRNAs were identified as regions of contiguous sequence that par- 
tially or fully overlap an intergenic region (JCVI genome anno- 
tation) with at least one position with > 500 mapped sequence 
reads. Fully antisense RNAs could not be identified due to the 
lack of strand information in the sequencing data. 
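
The selection rule described above can be sketched in a few lines of Python. This is an illustrative reconstruction, not the authors' pipeline: the function names, the 0-based half-open interval convention, and the definition of a "contiguous" region as a run of positions with at least one mapped read are all assumptions made for the example. Only the threshold (> 500 reads at one or more positions) and the requirement of partial or full overlap with an intergenic region come from the text.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # 0-based, half-open [start, end); a convention assumed here


def covered_regions(coverage: List[int]) -> List[Interval]:
    """Return maximal runs of positions that have at least one mapped read."""
    regions = []
    start = None
    for pos, depth in enumerate(coverage):
        if depth > 0 and start is None:
            start = pos                      # a covered run begins
        elif depth == 0 and start is not None:
            regions.append((start, pos))     # the run ended at the previous position
            start = None
    if start is not None:
        regions.append((start, len(coverage)))
    return regions


def overlaps(a: Interval, b: Interval) -> bool:
    """True if two half-open intervals share at least one position."""
    return a[0] < b[1] and b[0] < a[1]


def call_putative_srnas(coverage: List[int],
                        intergenic: List[Interval],
                        min_peak: int = 500) -> List[Interval]:
    """Keep covered regions that overlap an intergenic interval and whose
    deepest position has more than min_peak mapped reads."""
    calls = []
    for region in covered_regions(coverage):
        peak = max(coverage[region[0]:region[1]])
        if peak > min_peak and any(overlaps(region, ig) for ig in intergenic):
            calls.append(region)
    return calls
```

Because the comparison is strict, a region whose deepest position has exactly 500 reads is not called, which matches the "> 500" wording of the cut-off.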

Northern transfer and hybridization. For Northern analysis, 
a total of 15 µg of RNA was separated on a 1.5% formaldehyde 
MOPS gel and transferred to a nylon membrane by capillary action. 
Hybridization was performed using 60-mer oligonucleotides 
(Table S3) that were end-labeled with [γ-32P]ATP using T4 
polynucleotide kinase (Fermentas) for 1 h at 37°C. Membranes were hybridized 
at 42°C for 2 h in Amersham Rapid-hyb Buffer (GE Life Sciences) 
and were washed as per manufacturer's protocol. Densitometric 
quantitation of Northern Blots was performed using ImageQuant 
software and Sum Above Background calculations. Percent values 
indicated in Figure 1B are normalized to the condition with the 
highest signal. Four blots were reprobed with an oligonucleotide 
specific to 5S rRNA, as a loading control (Fig. S6). 
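
As a small worked illustration of the normalization described above, the sketch below rescales background-subtracted band intensities so that the condition with the highest signal is 100%. The function name and the dictionary layout are assumptions for the example; the intensities themselves would come from the densitometry software.

```python
from typing import Dict


def percent_of_max(signals: Dict[str, float]) -> Dict[str, float]:
    """Express each condition's band intensity as a percentage of the
    strongest band on the same blot."""
    peak = max(signals.values())
    if peak <= 0:
        raise ValueError("no signal above background")
    return {condition: 100.0 * value / peak
            for condition, value in signals.items()}
```

Values normalized this way are comparable across conditions for a single sRNA, but not between different sRNAs, since each blot is scaled to its own maximum.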

Deep 5' and 3' RACE and computational analysis. RACE 
experiments were performed using Y. pestis KIM6+ RNA isolated 
at 37°C and the FirstChoice® RLM-RACE Kit (Ambion). For 5' 
RACE, we used a modified 5' RLM-RACE protocol. A total of 
8 µg RNA was treated with tobacco acid pyrophosphatase (TAP) 
at 37°C for 1 h, followed by ligation of the 5' RACE adaptor 
at 37°C for 1 h. The resulting RNA was then reverse 
transcribed according to the manufacturer's protocol. PCR was 
then performed on the resulting cDNA using a primer containing 
the Ion Torrent sequence for 5' RACE amplification and primers 
specific for each sRNA confirmed by northern analysis (Table S4). 
For some sRNAs, the PCR had to be re-amplified using a portion 
of the original reaction and the same primers, which then resulted 
in successful amplification of bands for all sRNAs. Resulting PCR 
products were purified using the QIAquick PCR Purification 
Kit and eluted in "Low TE" (10 mM Tris, 0.1 mM EDTA). 
All products were then pooled together and sent for Ion Torrent 
deep sequencing using a 314 chip (Wadsworth Center Applied 
Genomic Technologies Core Facility). 

For 3' RACE, we utilized the miScript Reverse Transcription 
Kit (Qiagen) to perform reverse transcription on Y. pestis KIM6+ 
RNA isolated at 37°C according to the manufacturer's protocol. 
PCR was performed on the resulting cDNA using the primers 
containing Ion Torrent sequence for 3' RACE (universal primer) 
and each sRNA. Resulting PCR products were purified using the 
QIAquick PCR Purification Kit, eluted in "Low TE," pooled and 
sent for Ion Torrent deep sequencing using a 314 chip (Wadsworth 
Center Applied Genomic Technologies Core Facility). 

Any sequences lacking the expected 5' or 3' adaptor sequences 
were removed. We then extracted non-adaptor sequence from the 
remaining reads and mapped them to the Y. pestis KIM genome 
using BWA with default parameters. 36 Reads were assumed to be 
associated with an sRNA if they were located within 1 kbp of the 
predicted location and were located on the predicted strand. 
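
The read-assignment rule can be expressed as a simple filter. The sketch below is hypothetical: the `Locus` record, the function name, and the reading of "within 1 kbp of the predicted location" as "no more than 1,000 bp outside the predicted start or end" are assumptions, since the text does not define the distance more precisely.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Locus:
    """Predicted sRNA location (field names are illustrative)."""
    name: str
    start: int
    end: int
    strand: str  # "+" or "-"


def assign_read(read_pos: int, read_strand: str,
                loci: List[Locus], window: int = 1000) -> Optional[str]:
    """Return the name of the first locus on the same strand whose predicted
    interval, extended by `window` bp on each side, contains the read."""
    for locus in loci:
        if read_strand != locus.strand:
            continue  # reads on the opposite strand are never assigned
        if locus.start - window <= read_pos <= locus.end + window:
            return locus.name
    return None
```

For example, with a locus predicted at 5,000 to 5,100 on the plus strand, a plus-strand read at position 4,000 is assigned to it, whereas a read at 3,999 or any minus-strand read is not.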

Clustering analysis of Ysrs verified by northern blot analy- 
sis. To assign expression patterns of Ysrs into groups, we used 
k-means clustering to partition the sRNAs into k = 9 clusters, 
selecting k by minimizing the value of the Kelley penalty. 37 These 
nine groups were manually adjusted to seven for clarity. 
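
To make the choice of k concrete, the sketch below scores candidate values of k with a Kelley-style penalty. The text does not give the formula, so the form used here is an assumption: the average within-cluster spread for each k is rescaled linearly to the range 1 to N − 1 (N being the number of items clustered) and the number of clusters is added, with the smallest total preferred. The function names and the input layout (a mapping from k to a precomputed average spread) are also illustrative.

```python
from typing import Dict


def kelley_penalties(avg_spread: Dict[int, float], n_items: int) -> Dict[int, float]:
    """Penalty for each candidate k: average spread rescaled to [1, n_items - 1],
    plus k. This is an assumed form of the Kelley penalty, not taken from the text."""
    lo = min(avg_spread.values())
    hi = max(avg_spread.values())
    if hi == lo:
        raise ValueError("average spread does not vary across k")
    scale = (n_items - 2) / (hi - lo)
    return {k: scale * (spread - lo) + 1 + k
            for k, spread in avg_spread.items()}


def best_k(avg_spread: Dict[int, float], n_items: int) -> int:
    """Return the k with the smallest penalty."""
    penalties = kelley_penalties(avg_spread, n_items)
    return min(penalties, key=penalties.get)
```

The penalty balances two opposing trends: tighter clusters lower the spread term, while each additional cluster raises the count term, so the minimum marks the point beyond which further splitting no longer pays for itself.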

Conservation analysis. Sequences from 100 bp upstream to 
100 bp downstream of each sRNA (coordinates derived from the 
original Illumina sequencing data) were used to search against 
the Y. pseudotuberculosis 32953, Y. enterocolitica 8081 and E. coli 
K-12 (MG1655) strains using BLAST with the default param- 
eters. 38 BLAST matches were realigned using ClustalW. 39 
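
The query windows used for the conservation search can be derived from the sRNA coordinates as sketched below. The function name and the 1-based inclusive coordinate convention are assumptions, and the sketch simply clips windows at the ends of the sequence rather than wrapping around a circular chromosome.

```python
from typing import Tuple


def flank_window(start: int, end: int, genome_length: int,
                 flank: int = 100) -> Tuple[int, int]:
    """Extend a 1-based inclusive interval by `flank` bp on each side,
    clipping at the first and last positions of the sequence."""
    if not (1 <= start <= end <= genome_length):
        raise ValueError("coordinates fall outside the sequence")
    return max(1, start - flank), min(genome_length, end + flank)
```

Including flanking sequence in the query allows for imprecision in the ends inferred from the initial sequencing data.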

Conclusions 

In summary, we have identified 31 sRNAs in Y. pestis, of which 
14 are novel, and 11 of these 14 have no known E. coli homolog. 
Based on the patterns of sRNA expression and on the differences 
between sRNA expression in Y. pestis and Y. pseudotuberculosis, 
we propose that many of these sRNAs contribute to the unique 
biology of Y. pestis, and may play important roles in virulence. 